An Efficient Approach to Scale up k-medoid based Algorithms in Large Databases

Authors

  • Maria Camila Nardini Barioni
  • Humberto Luiz Razente
  • Agma J. M. Traina
  • Caetano Traina
Abstract

Scalable data mining algorithms have become crucial to efficiently support knowledge discovery in databases (KDD) processes on large databases. In this paper, we address the task of scaling up k-medoid-based algorithms by using metric access methods, which allows database management systems to execute clustering algorithms in a fraction of the time usually required by traditional approaches. Experimental results on several datasets, both synthetic and real, show that the proposed algorithm may reduce the number of distance calculations by a factor of more than a thousand compared to existing algorithms, while producing clusters of comparable quality.
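The abstract measures cost in distance calculations. The sketch below is not the paper's algorithm and does not build a metric access method. It only illustrates, under simplifying assumptions, the triangle-inequality pruning that metric access methods generally rely on, applied to the assignment step of a k-medoids iteration. The function names (`make_counted_metric`, `assign_naive`, `assign_pruned`) and the Euclidean metric are assumptions chosen for illustration.

```python
import math
import random


def make_counted_metric():
    """Return a Euclidean distance function and a counter of its calls."""
    counter = {"calls": 0}

    def dist(a, b):
        counter["calls"] += 1
        return math.dist(a, b)

    return dist, counter


def assign_naive(points, medoids, dist):
    """Assign each point to its nearest medoid by computing every distance."""
    labels = []
    for p in points:
        distances = [dist(p, m) for m in medoids]
        labels.append(distances.index(min(distances)))
    return labels


def assign_pruned(points, medoids, dist):
    """Same assignment, skipping medoids ruled out by the triangle inequality."""
    k = len(medoids)
    # Medoid-to-medoid distances are computed once and reused for every point.
    between = [
        [dist(medoids[i], medoids[j]) if i != j else 0.0 for j in range(k)]
        for i in range(k)
    ]
    labels = []
    for p in points:
        best, best_d = 0, dist(p, medoids[0])
        for j in range(1, k):
            # If d(m_best, m_j) >= 2 * d(p, m_best), the triangle inequality
            # gives d(p, m_j) >= d(p, m_best), so m_j cannot be closer.
            if between[best][j] >= 2.0 * best_d:
                continue
            d = dist(p, medoids[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels


if __name__ == "__main__":
    random.seed(0)
    centers = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
    # Hypothetical toy data: 50 points scattered closely around each center.
    points = [
        (cx + random.uniform(-1, 1), cy + random.uniform(-1, 1))
        for cx, cy in centers
        for _ in range(50)
    ]
    naive_dist, naive_count = make_counted_metric()
    pruned_dist, pruned_count = make_counted_metric()
    assign_naive(points, centers, naive_dist)
    assign_pruned(points, centers, pruned_dist)
    print("naive distance calls: ", naive_count["calls"])
    print("pruned distance calls:", pruned_count["calls"])
```

Both functions return the same assignment; only the number of distance evaluations differs. The saving depends on how well separated the medoids are, so this toy setup says nothing about the speedups reported in the paper.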


Similar resources

COMPUTATIONALLY EFFICIENT OPTIMUM DESIGN OF LARGE SCALE STEEL FRAMES

Computational cost of metaheuristic based optimum design algorithms grows excessively with structure size. This results in computational inefficiency of modern metaheuristic algorithms in tackling optimum design problems of large scale structural systems. This paper attempts to provide a computationally efficient optimization tool for optimum design of large scale steel frame structures to AISC...


Medoid Queries in Large Spatial Databases

Assume that a franchise plans to open k branches in a city, so that the average distance from each residential block to the closest branch is minimized. This is an instance of the k-medoids problem, where residential blocks constitute the input dataset and the k branch locations correspond to the medoids. Since the problem is NP-hard, research has focused on approximate solutions. Despite an av...


Similarity-Invariant Sketch-Based Image Retrieval in Large Databases

Proliferation of touch-based devices has made the idea of sketch-based image retrieval practical. While many methods exist for sketch-based image retrieval on small datasets, little work has been done on large (web)-scale image retrieval. In this paper, we present an efficient approach for image retrieval from millions of images based on user-drawn sketches. Unlike existing methods which are sen...


Face Recognition Based Rank Reduction SVD Approach

Standard face recognition algorithms that use standard feature extraction techniques always suffer from image performance degradation. Recently, singular value decomposition and low-rank matrix methods have been applied in many applications, including pattern recognition and feature extraction. The main objective of this research is to design an efficient face recognition approach by combining many tech...


Analysis of Clustering Techniques in VLSI Cell Partitioning

Circuit partitioning plays a dominant role in the VLSI physical design of chips. In this paper, the newly proposed rank-based k-medoid clustering algorithm is discussed, in order to partition a combinational circuit based on the interconnection distance among cell groups. Clustering analysis of the given circuit partitions the set of objects into non-overlapping subsets. The proposed ranked k-me...




Publication date: 2006